A growing body of evidence supports the existence of an extensive network of RNA-binding proteins (RBPs) whose 
combinatorial binding affects the post-transcriptional fate of every mRNA in the cell — yet we still do not have a complete 
understanding of which proteins bind to mRNA, which of these bind concurrently, and when and where in the cell they 
bind. We describe here a method to identify the proteins that bind to RNA concurrently with an RBP of interest, using 
quantitative mass spectrometry combined with RNase treatment of affinity-purified RNA-protein complexes. We applied 
this method to the known RBPs Pab1, Nab2, and Puf3. Our method significantly enriched for known RBPs and is a clear 
improvement upon previous approaches in yeast. Our data reveal that some reported protein-protein interactions may 
instead reflect simultaneous binding to shared RNA targets. We also discovered more than 100 candidate RBPs, and we 
independently confirmed that 77% (23/30) bind directly to RNA. The previously recognized functions of the confirmed 
novel RBPs were remarkably diverse, and we mapped the RNA-binding region of one of these proteins, the transcriptional 
coactivator Mbf1, to a region distinct from its DNA-binding domain. Our results also provided new insights into the roles of 
Nab2 and Puf3 in post-transcriptional regulation by identifying other RBPs that bind simultaneously to the same mRNAs. 
While existing methods can identify sets of RBPs that interact with common RNA targets, our approach can determine which 
of those interactions are concurrent — a crucial distinction for understanding post-transcriptional regulation. 

[Supplemental material is available for this article.] 



Life depends on the coordinated temporal, spatial, and stoichio- 
metric regulation of gene expression. Combinatorial binding by 
specific transcription factors allows for the concerted temporal 
regulation of large sets of genes in physiological and develop- 
mental programs at a transcriptional level. The resulting RNA 
transcripts are also subject to further regulation at the levels of 
RNA processing, transport, localization, translation, and degra- 
dation. The added dimensions of regulation provided by RNA- 
binding proteins (RBPs) enable more precise temporal, spatial, 
and stoichiometric control of protein production (Wang et al. 
2002; Paquin et al. 2007; Jansen et al. 2009; Kurischko et al. 2011). 
Specific RBPs bind to distinct sets of mRNAs, typically encoding 
proteins destined for similar subcellular localizations or with re- 
lated biological functions, suggesting a model in which con- 
certed, combinatorial binding of specific mRNAs by specific sets 
of RBPs can affect the post-transcriptional fate of potentially 
every mRNA in the cell (Hieronymus and Silver 2003; Gerber 
et al. 2004; Ong et al. 2004; Keene 2007a,b; Hogan et al. 2008). 
Despite the many lines of evidence pointing to pervasive post- 
transcriptional regulation of gene expression mediated by RBPs, 
we still do not have a complete understanding of which proteins
bind to mRNA, which of these bind concurrently, and when and
where in the cell they bind.

Corresponding authors
E-mail pbrown@stanford.edu
E-mail mmann@biochem.mpg.de

Article published online before print. Article, supplemental material, and publication
date are at http://www.genome.org/cgi/doi/10.1101/gr.153031.112.
Freely available online through the Genome Research Open Access option.

Previous global approaches to identify proteins that interact 
with mRNAs in yeast have been mostly focused on in vitro bind- 
ing, mass spectrometry, or computational predictions. Although 
powerful, these techniques may miss complex RNA-protein in- 
teractions assembled in vivo, less abundant RBPs, and RBPs that 
lack domains known to bind RNA (Butter et al. 2009; Scherrer 
et al. 2010; Tsvetanova et al. 2010). In fact, >75% (503 out of 647) 
of the proteins annotated as RBPs lack domains known to bind 
RNA (Tsvetanova et al. 2010). Conversely, despite the fact that 
~10% of the yeast proteome is annotated as "known" RBPs (annotated
in the yeast genome database, experimentally validated, 
or with homology with known RNA-binding domains), some 
proteins not annotated as RBPs nonetheless reproducibly co- 
purify with distinct sets of RNAs in vivo (Hogan et al. 2008). The 
known functions of some RBPs would not suggest their involve- 
ment in the post-transcriptional regulation of RNA. For example, 
the metabolic enzyme aconitase, which catalyzes the isomeriza- 
tion of citrate to isocitrate, also functions as an RNA-binding pro- 
tein, binding to iron regulatory elements in target mRNAs to reg- 
ulate their translation or stability in response to iron availability 
(Hentze et al. 1987a,b; Casey et al. 1988; Leibold and Munro 1988; 
Rouault et al. 1989; Bertrand et al. 1993). Previous work using 
protein microarrays to search for new RNA-binding proteins in 
yeast identified additional unexpected RBPs, including several 
enzymes (Scherrer et al. 2010; Tsvetanova et al. 2010). Recently, 
two papers used mass spectrometry to identify hundreds of novel 
RBPs in human cells (Baltz et al. 2012; Castello et al. 2012). These 
and other examples suggesting regulatory RNA-binding activity in 
unexpected proteins highlight the need for additional experi- 
mental methods to enable the quantitative, unbiased, and accurate 
discovery of novel RNA-protein interactions from complexes as- 
sembled in vivo. 

The post-transcriptional operon model hypothesizes that 
the fate of a given mRNA molecule is influenced by the con- 
certed, combinatorial binding of specific RBPs (Keene 2007a,b) — 
yet we know surprisingly little about which RBPs bind to mRNAs 
concurrently. It is thought that the specific complement of RBPs 
bound to a given mRNA specifies its post-transcriptional fate, but 
nearly all existing data are limited to defining pairwise interac- 
tions between a single RBP and a single mRNA species. Previous 
work to identify the mRNA targets bound by individual RBPs has 
mostly relied on purification of the RBP from a whole-cell lysate 
followed by analysis of the copurifying mRNAs (Gerber et al. 
2004; Ule et al. 2005; Keene 2007a,b; Hogan et al. 2008; Bohnsack 
et al. 2009; Granneman et al. 2009, 2010; Wolf et al. 2010; Scherrer 
et al. 2011; Schenk et al. 2012). These approaches do not differen- 
tiate between two RBPs that bind simultaneously to their com- 
mon mRNA targets and two RBPs that bind to a common set of 
mRNA targets but at different times or in different cellular loca- 
tions. This limits our understanding of post-transcriptional reg- 
ulation, because from birth to death the average mRNA molecule 
is estimated to be bound by at least 10 different known RBPs 
during the entirety of its processing, export, transport, localiza- 
tion, translation, and degradation (Hogan et al. 2008). The post- 
transcriptional regulatory network is determined not only by 
which RBPs bind to a given mRNA, but in what temporal pro- 
grams and in what combinations with other RBPs. Identifying 
well-characterized RBPs that bind mRNAs simultaneously with 
an RBP of unknown role would provide immediate clues to its 
functions. For example, if an uncharacterized RBP binds con- 
currently with RBPs known to be involved in splicing, the unchar- 
acterized RBP can be inferred to bind in the nucleus during splic- 
ing and possibly play a role in splicing. 

Mass spectrometry (MS)-based proteomics is a powerful tool 
for studying cellular interactions, especially if used in a quantita- 
tive format. Stable isotope labeling of amino acids in cell culture, 
SILAC (Mann 2006), is one such quantitative proteomics tech- 
nology, and it can be used to detect selective enrichment. This 
technique has been applied to GFP-tagged proteins (Trinkle- 
Mulcahy et al. 2008; Hubner et al. 2010), modified peptides 
(Schulze and Mann 2004), DNA (Mittler et al. 2009), and RNA 
(Butter et al. 2009; Baltz et al. 2012; Castello et al. 2012; Scheibe 
et al. 2012) to identify previously unknown binders. Here we 
used quantitative mass spectrometry combined with RNase treat- 
ment of affinity-purified RNA-protein complexes assembled in 
vivo to identify the proteins that bind to RNA concurrently with the 
known RBPs Pab1, Nab2, and Puf3. 



Results 

A quantitative proteomic method for identifying 
RNA-dependent protein interactions 

We used quantitative mass spectrometry to identify the proteins 
that copurify with a protein of interest in an RNA-dependent 
manner (Fig. 1). We first purified a TAP-tagged protein by IgG- 
protein-A affinity purification from a "light" (unlabeled) cell lysate 
and from a "heavy" lysate labeled by incorporation of ¹³C and
¹⁵N isotope-enriched lysine. We then divided the IgG beads with
the associated TAP-tagged protein into two equal parts and
digested one of them with RNase. Finally, we combined the heavy-labeled
lysate that was not treated with RNase with the light, RNase-treated lysate
and quantified the heavy-to-light SILAC ratio by mass spectrometry.
By design, this assay specifically measures enrichment due to RNA-dependent
association with the TAP-tagged protein. The reverse or
'label-swapped' experiment, in which a heavy-labeled, RNase-treated
lysate was instead combined with light (unlabeled) lysate without
RNase treatment, served as a replicate and as a control for contaminant
proteins that are unlabeled in both experiments.

Figure 1. A method for identifying RNA-dependent protein interactions.
An overview of our proteomic method for identifying RNA-dependent
protein interactions. A yeast strain with a TAP-tagged protein
of interest is grown in media labeled with heavy (¹³C and ¹⁵N isotope-enriched)
lysine or in unlabeled media. The cells are lysed and the protein of
interest is purified using the TAP tag. The unlabeled sample is treated with
RNase, and the heavy-labeled sample is not. The beads are then boiled in
SDS-PAGE buffer to release any bound proteins, and the heavy-labeled,
RNase-untreated sample is combined with the unlabeled, RNase-treated
sample (we also performed the inverse as a replicate and to control for
labeling-related artifacts). The heavy-to-light ratio measured by mass
spectrometry indicates the fraction of the bound protein that was
liberated by RNase treatment.

The resulting heavy-to-light SILAC ratios are a measure of 
the RNA-dependent copurification of a given protein with the 
TAP-tagged protein of interest. When the heavy labeled sample 
is not treated with RNase and the light sample is treated with 
RNase, proteins that are lost from the beads in response to RNase 
treatment will be present more in the heavy labeled sample than 
the light sample. Consequently, proteins will tend to have heavy- 
to-light ratios greater than one if they copurify with the TAP-tagged 
protein of interest in an RNA-dependent manner. For the reverse 
experiment, in which the heavy labeled sample is treated with 
RNase and the light sample is not, proteins will have heavy-to-light 
ratios less than one if they copurify with the TAP-tagged protein 
of interest in an RNA-dependent manner. To make the results of 
these replicates directly comparable, we invert the heavy-to-light 
ratios in the reversed experiment. For simplicity, we represented 
RNA dependence as the ratio of (-) RNase to (+) RNase, so that 
RNA-dependent binders would always be expected to have (-/+) 
RNase ratios greater than one if they copurify with the TAP-tagged 
protein of interest in an RNA-dependent manner, regardless of the 
labeling scheme. 
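
As a minimal sketch of this bookkeeping (not the authors' analysis code), the conversion from a measured heavy-to-light ratio to a (-/+) RNase ratio can be written as follows. The function name and the example ratios are hypothetical and chosen only for illustration.

```python
def minus_plus_rnase_ratio(heavy_to_light: float, heavy_is_rnase_treated: bool) -> float:
    """Convert a SILAC heavy/light ratio into a (-/+) RNase ratio.

    Hypothetical helper: the name and signature are assumptions, not taken
    from the source.
    """
    if heavy_to_light <= 0:
        raise ValueError("SILAC ratios must be positive")
    # Forward scheme: heavy = untreated, light = RNase-treated,
    # so heavy/light already equals (-)/(+).
    # Label-swapped scheme: heavy = RNase-treated, so the ratio is inverted.
    return 1.0 / heavy_to_light if heavy_is_rnase_treated else heavy_to_light


# Illustrative values only: the same RNA-dependent binder seen in both schemes.
forward = minus_plus_rnase_ratio(4.0, heavy_is_rnase_treated=False)
swapped = minus_plus_rnase_ratio(0.25, heavy_is_rnase_treated=True)
print(forward, swapped)
```

With this convention, an RNA-dependent binder gives a value greater than one in both the forward and the label-swapped replicate, which is what makes the two replicates directly comparable.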

We used this method to identify proteins that interact in an 
RNA-dependent manner with the RBPs Pab1, Nab2, or Puf3, re-
spectively. Pab1 and Nab2 have each been shown to bind to 
more than a thousand different mRNAs, while Puf3 binds to a 
smaller, highly specific set of mRNAs (Gebauer and Hentze 2004; 
Gerber et al. 2004; Hogan et al. 2008). To assess the scale and 
reproducibility of the data, we plotted the RNA dependence of 
each protein as the log2 (-/+) RNase ratios from the two repli-
cate experiments for Pab1, Nab2, and Puf3 (Fig. 2). In these plots, 
the reproducible RNA-dependent binders (RDBs) form a tail along 
the diagonal, while proteins that interact directly with the tagged 
protein, independent of RNA, are clustered around the origin. As 
a standard measure of RNA-dependent association with the TAP- 
tagged protein, we first normalized the (-/+) RNase ratios to set 
the ratio for the TAP-tagged protein itself to one, based on the 
premise that enrichment of the TAP-tagged protein itself should 
not be RNA dependent. We then averaged the (-/+) RNase ratios 
in both replicate experiments and used the base 2 logarithm 
of this value as our standard measure of RNA-dependent associ- 
ation with the TAP-tagged protein (referred to as RNA-dependence 
values). 
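
The normalization and averaging steps described above amount to a few lines of arithmetic. The sketch below assumes that the replicate average is an arithmetic mean of the bait-normalized ratios, which is how the text reads; the function name and the numbers are illustrative and not taken from the data.

```python
import math


def rna_dependence(ratio_rep1: float, ratio_rep2: float,
                   bait_rep1: float, bait_rep2: float) -> float:
    """log2 of the mean bait-normalized (-/+) RNase ratio across two replicates.

    Hypothetical helper; all arguments are (-/+) RNase ratios, with the
    label-swapped replicate already inverted.
    """
    # Normalize each replicate so that the TAP-tagged bait itself has ratio 1.
    norm1 = ratio_rep1 / bait_rep1
    norm2 = ratio_rep2 / bait_rep2
    # Average the two normalized ratios, then take the base 2 logarithm.
    return math.log2((norm1 + norm2) / 2.0)


# Illustrative numbers: a copurifying protein and the bait itself.
print(rna_dependence(6.0, 10.0, bait_rep1=2.0, bait_rep2=2.0))  # protein
print(rna_dependence(2.0, 2.0, bait_rep1=2.0, bait_rep2=2.0))   # bait
```

By construction the bait has an RNA-dependence value of zero, so positive values indicate proteins that are lost on RNase treatment more than the bait is.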

Figure 2. Overview of RNA-dependent interaction data. A scatterplot of
the RNA-dependent enrichment values as the log base 2 (-/+) RNase
ratios for two replicates (with inverted labeling schemes) of each protein
that we purified. The reproducible RNA-dependent binders form a tail in
quadrant 1. (A) A plot of example data with a green box indicating the
quadrant where RNA-dependent binders are expected to be found and
a blue circle indicating where RNA-independent binders are expected to
be found. These colored regions are broad generalizations only and were
not used for actual data analysis. (B) A scatterplot of the log base 2 (-/+)
RNase ratios for the replicate experiments with Pab1. The points representing
the proteins Pab1, Nab2, and Puf3 are highlighted in red, yellow,
and green, respectively. (C) The same scatterplot for experiments with
the protein Nab2. (D) The same scatterplot for experiments with the
protein Puf3.

To initially evaluate the performance of this assay, we compared
the distribution of RNA-dependence values for proteins
annotated as RBPs and proteins without such an annotation
(Supplemental Fig. S1). The RNA-dependence values for annotated
RBPs were significantly shifted toward higher values in the Pab1,
Nab2, and Puf3 purifications (P-values 4 × 10⁻⁸, 2 × 10⁻⁵, and
3 × 10⁻⁷, respectively), showing that the method enables
RNA-dependent binders (RDBs) to be identified and that proteins with 
larger RNA-dependence values are more likely to be annotated as 
RBPs. Although results from traditional mass spectrometry have 
frequently been biased by protein abundance, we found no cor- 
relation between protein abundance and the RNA-dependence 
values (Supplemental Fig. S2). This demonstrates that our clas- 
sification of the proteins we detected as RDBs was not affected 
by their abundance. 
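
The shift toward higher RNA-dependence values among annotated RBPs, reported above, is the kind of comparison a rank-based test addresses. The text does not name the test that produced those P-values, so the sketch below is an assumption: a one-sided rank-sum (Mann-Whitney) test with a normal approximation and no tie correction of the variance. The function name and the toy values are hypothetical.

```python
import math


def rank_sum_shift_p(group_a, group_b):
    """One-sided P-value that values in group_a tend to exceed those in group_b.

    Rank-sum test with midranks for ties and a normal approximation.
    This is an illustrative choice, not the test stated in the source.
    """
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        midrank = (i + 1 + j) / 2.0
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Toy RNA-dependence values (not real data).
annotated = [1.8, 2.4, 3.1, 0.9, 2.2, 1.5]
unannotated = [0.1, -0.3, 0.4, 0.0, 0.7, -0.1, 0.2]
print(rank_sum_shift_p(annotated, unannotated))
```

For groups as small as this toy example an exact test would be preferable; the normal approximation is used here only to keep the sketch short.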

To establish a conservative cutoff for the classification of 
proteins into RNA-dependent and RNA-independent binders, we 
created a null distribution by modeling RNA-dependence values 
for proteins with RNA-independent interactions with Pab1, Nab2, 
and Puf3. To do this, we made two assumptions: first, that after 
normalization any RNA-dependence values less than zero have 
a true value of zero and the observed variation from zero is due 
to noise; and second, that this noise is symmetric about zero (see 
Methods). We used the null distribution as the basis for esti- 
mating an empirical false discovery rate (FDR) for classification 
of proteins as RDBs, with an FDR threshold of 10% (Supplemental 
Fig. S3). 
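
The two assumptions above suggest a mirror-image null: every negative RNA-dependence value is treated as noise, and symmetric noise implies an equally likely null value of the opposite sign. The sketch below illustrates that idea only; the exact procedure is described in Methods and may differ, and the function names and toy values are hypothetical.

```python
def empirical_fdr(values, cutoff):
    """Estimated FDR when calling every protein with value >= cutoff an RDB."""
    # Under the mirror-image null, the number of null proteins expected at or
    # above +cutoff equals the number of observed values at or below -cutoff.
    expected_false = sum(1 for v in values if v <= -cutoff)
    called = sum(1 for v in values if v >= cutoff)
    return expected_false / called if called else 0.0


def smallest_cutoff(values, max_fdr=0.10):
    """Smallest positive observed value whose estimated FDR is within max_fdr."""
    for cutoff in sorted(v for v in values if v > 0):
        if empirical_fdr(values, cutoff) <= max_fdr:
            return cutoff
    return None


# Toy RNA-dependence values (not real data).
values = [-0.9, -0.4, -0.2, -0.1, 0.1, 0.3, 0.5, 1.2, 1.8, 2.5, 3.1, 3.6, 4.0]
print(smallest_cutoff(values))
```

In this toy set the estimated FDR falls from 4/9 at a cutoff of 0.1 to 1/7 at 0.5, and first drops to 10% or below at 1.2, where no mirrored negative value remains.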

At least half of the proteins classified as RDBs based on our 
10% empirical FDR threshold were proteins known to bind RNA 
(Fig. 3A). In the combined data set, there were 220 RDBs, 48% 
of which were known RNA-binding proteins. This represents a 
significant enrichment of known RBPs relative to the set of all 
proteins that can be detected by mass spectrometry from a yeast 
whole-cell lysate (~15%, hypergeometric P-value 2 × 10⁻³⁵) 
(Supplemental Table S7; de Godoy et al. 2008). We also examined 
a published data set of "high-confidence" protein-protein in- 
teractions based on large-scale affinity mass spectrometry studies 
(Gavin et al. 2006; Krogan et al. 2006; Collins et al. 2007), and we
found that most of the previously published physical interactors
of Pab1 and Nab2 that we detected in our purifications were
actually RNA dependent, suggesting that protein
interactions involving RNA-binding proteins (especially those 
that bind to thousands of different RNAs) may often be indirect 
and mediated by concurrent binding to RNA molecules (see the 
Supplemental Material for further information). 
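
The enrichment of known RBPs among the 220 RDBs reported above is a hypergeometric upper-tail probability. The sketch below shows that calculation using only the standard library. The count of 106 known RBPs follows from the 220 RDBs and the 114 candidates not annotated as RBPs; the background size of 4000 detectable proteins, of which 600 (15%) are known RBPs, is a placeholder assumption, so the printed value is not expected to reproduce the published P-value.

```python
from math import comb


def hypergeom_upper_tail(population: int, successes: int,
                         draws: int, observed: int) -> float:
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    # Sum exact integer counts of favorable draws, then divide once.
    tail = sum(comb(successes, k) * comb(population - successes, draws - k)
               for k in range(observed, upper + 1))
    return tail / total


# Placeholder background: 4000 detectable proteins, 600 of them known RBPs.
# These two numbers are assumptions made for illustration only.
p_illustrative = hypergeom_upper_tail(population=4000, successes=600,
                                      draws=220, observed=106)
print(p_illustrative)
```

Summing exact integer binomial coefficients avoids the loss of precision that comes from adding very small floating-point terms.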

The experiments described above used a buffer containing EDTA.
We next performed the Pab1 IP experiment in a buffer containing
magnesium. This led to a highly significant enrichment 
of known RNA-binding proteins composed almost entirely of ri- 
bosomal proteins and proteins involved in the initiation, elonga- 
tion, and termination of translation (Supplemental Table S6). The 
majority of these proteins were not observed as RNA-dependent 
binders in experiments done in the presence of EDTA, in which 
ribosomes are no longer assembled on mRNA. These data provide 
a unique perspective into Pab1-containing RNA-protein complexes 
involved in translation. 

The high frequency of known RBPs among the 220 RDBs
identified in this study contrasts with a frequency of ~20%
known RBPs among the 220 highest-ranking hits identified in
two previous studies using protein microarrays (including one
method developed by members of our group) (Scherrer et al.
2010; Tsvetanova et al. 2010). Despite this difference, our 220
RDBs are significantly enriched in the protein microarray data
from Tsvetanova et al. (2010) and also from Scherrer et al. (2010)
(Supplemental Fig. S5; Wilcoxon P-values 0.005 and 0.009, respectively).
However, there was no Spearman rank correlation between
our data and that from either protein array data set. These results
suggest that our method both corroborates and extends previous
work identifying RNA-interacting proteins.

Figure 3. Barplots showing enrichment of known RNA-binding proteins and the RNA dependence of
published protein-protein interactions. (A) The fraction of proteins we identified as RNA-dependent
binders that are known RBPs (defined as those that have domains known to bind RNA or have a molecular
function of RNA binding in the Gene Ontology database). Known RNA-binding proteins are
significantly enriched among the proteins interacting in an RNA-dependent manner with Pab1, Nab2,
and Puf3 (for example, P-value of 2 × 10⁻³⁵ for the union of all three data sets). (B) The percentage
of proteins known to bind RNA in the set of 220 RNA-dependent binders in our combined data set,
compared with the top 220 proteins from the protein array data from two previously published attempts
to identify proteome-wide RNA-protein interactions. All had significant enrichment of known RBPs, but
the enrichment seen with the set of proteins identified by our method was much greater (hypergeometric
P-values 2 × 10⁻³⁵, 2 × 10⁻⁴, and 2 × 10⁻⁵ for this study, Scherrer et al. 2010, and Tsvetanova
et al. 2010, respectively). (C) A barplot showing the RNA dependence of published high-confidence
protein-protein interactions and also the number of novel interactions (RNA dependent) we observed
with Pab1, Nab2, and Puf3.

We analyzed the enrichment of Gene Ontology (GO) terms, 
protein domains from the protein families database (PFAM), and 
biological pathways from the Kyoto Encyclopedia of Genes and 
Genomes (KEGG) relative to all the proteins that could be de- 
tected from an analysis of a yeast whole-cell lysate (Fig. 4). Our 
data clearly partition the proteins into groups with strong ties 
to RNA-dependent or independent binding as evidenced by the 
enrichment of GO terms, PFAM domains, and KEGG pathways. 
The proteins we classified as having RNA-dependent interactions 
with Pab1, Nab2, or Puf3 were enriched for Gene Ontology (GO) 
terms referring to RNA-related biological processes and molecular 
functions, such as RNA binding, transcription, splicing, trans- 
lation, and decay (Fig. 4). Importantly, the majority of these RNA- 
related GO terms were not similarly enriched among the pro- 
teins falling below the threshold we set for classification as RNA- 
dependent binders, again demonstrating that our method had 
successfully separated these proteins based on their ability to bind 
RNA. The proteins with RNA-dependent interactions were also 
enriched for several protein domains known to bind RNA, DNA, 
or nucleic acid in general, such as SWIRM nucleic acid-binding 
domains, LSM RNA-binding domains, RNA recognition motif do- 
mains, MIF4G protein- and nucleic acid-binding domains, La RNA- 
binding domains, and DEAD/DEAH-box helicase domains. None 
of these domains were enriched among the proteins falling below 
the cutoff for RNA-dependent interactions. 

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway 
enrichment provides further insight into the functional roles of 
Pab1, Nab2, and Puf3. Specifically, the proteins that interact in an 
RNA-dependent manner with Pab1 and Nab2, but not those that 



interact with Puf3, were enriched for 
the KEGG pathways "Spliceosome" and 
"RNA polymerase" (Fig. 4). Nab2 and 
Pab1 have been implicated in mRNA 
end processing and polyadenylation, 
and they are generally believed to bind 
to their mRNA targets at this stage (Hector 
et al. 2002; Brune et al. 2005; Dunn et al. 
2005; Iglesias and Stutz 2008; Tutucci 
and Stutz 2011). However, our evidence 
that they associate simultaneously with 
RNA polymerase and spliceosomes
suggests that these proteins 
may, in fact, bind earlier, during tran- 
scription. This enrichment was not seen 
among the RNA-dependent interactions 
with Puf3, suggesting that Puf3 binds 
later in the life of its mRNA targets. 
DNA-binding and transcription-related 
GO terms were also enriched among 
proteins that we found to have RNA- 
dependent interactions with Nab2 or 
Pab1, but not with Puf3, providing further
evidence that Pab1 and Nab2 bind cotranscriptionally
to nascent transcripts, but 
that Puf3 does not. Conversely, the KEGG 
RNA degradation pathway annotation 
was specifically enriched among the 
proteins interacting in an RNA-dependent 
manner with Puf3, consistent with the 
known role of Puf3 in promoting the degradation of its mRNA 
targets (Gerber et al. 2004; Lee et al. 2010). 



Identification and validation of novel RNA-binding proteins 

The strong enrichment of known RNA-binding proteins that we 
observed among the 220 RNA-dependent binders (Fig. 3A) makes 
it likely that most of the 114 proteins in this group that are 
not currently annotated as RNA-binding proteins also bind RNA 
(Supplemental Fig. S6; Supplemental Table S5). To test whether 
these candidate RBPs bind directly to RNA, we used a method 
based in part on previous approaches (Greenberg 1979, 1980; Ule 
et al. 2005) that combines UV cross-linking, affinity purification, 
RNase treatment, polynucleotide kinase labeling with 32 P, and 
denaturing SDS-PAGE electrophoresis (Supplemental Fig. S7). 
This method allows us to identify whether a candidate RBP makes 
direct contact with RNA (within 1 Å) (Pramanik and Bewley 1996; 
Ule et al. 2005). We tested 25 of the 76 candidate RBPs that were 
not reported to interact physically with known RBPs as well as 
five of the 38 candidates that have been reported to interact physi- 
cally with known RNA-binding proteins. We also included 10 
known RBPs as positive controls and five putative negative con- 
trol proteins that were selected from among highly abundant 
proteins (95th percentile for abundance) for which we had no 
evidence to suggest that they bind RNA. 

We quantified any detectable radioactive bands on our denaturing
SDS-PAGE gels corresponding to the candidate RBPs and
analyzed the relationship between the molecules of cross-linked
RNA that were detected and an estimate of the molecules of each
protein present (based on data from the Saccharomyces Genome
Database [Ghaemmaghami et al. 2003; de Godoy et al. 2008; Cherry
et al. 2012] and described in Methods). This revealed a correlation
between protein abundance and molecules of RNA cross-linked
(Pearson correlation coefficient of 0.8 for known RBPs and 0.7 for
all proteins), although there was a large variation in the amount
of RNA that could be cross-linked for proteins at the same abundance
level. For example, the known RBPs She3 and Puf3 have
similar protein abundance (~1000 and ~850 molecules per cell,
respectively), but Puf3 cross-links to ~40-fold more molecules of
RNA than She3. For more discussion of these differences in cross-linking
efficiency, see the Supplemental Material. The two negative
controls with detectable bands cross-linked to far fewer molecules
of RNA than would be expected based on their high abundance,
placing them in the bottom 10% of cross-linking efficiency
out of all 45 tested proteins (Table 1).

We used the cross-linking efficiency of the negative controls
to set a threshold for validated RNA-binding proteins.
There were 23 candidate RBPs above this threshold (out of the 25
with detectable cross-linking); their cross-linking efficiencies
ranged from ~0.1% to 0.001%. Remarkably, the four proteins that
cross-linked with the highest efficiency among all 45 proteins we
tested, including 10 known RBPs, were newly identified candidate
RBPs (Table 1). We also found that whether or not a candidate RBP
had been reported to interact physically with a known RBP was not
a strong predictor of whether it could be cross-linked directly
to RNA in our assay (4/5 for the potential indirect binders and
19/25 for the others). Overall, ~77% (23 of 30) of the candidate
RBPs cross-linked to RNA with higher efficiency than the negative
controls, providing strong evidence that the majority
of the novel candidate RBPs discovered in this study may bind
directly to RNA.

None of these 23 validated novel RBPs have any known RNA-binding
domains. While many of these proteins have unidentified molecular
functions, those with known roles and pathways are remarkably
diverse, including a vesicle trafficking protein (Sec16), a
transcription factor (Mbf1), a DNA-binding protein (Stm1), two
helicases (Ecm32, Slh1), a metabolic enzyme (Imd4), a GTPase
(Vps1), and a histone acetyltransferase (Eaf3) (Table 1). The
unsuspected RNA-binding activity of dozens of proteins found here
underscores the need for unbiased methods for discovering novel
RNA-binding proteins.

Figure 4. Differential enrichment of Gene Ontology terms, PFAM domains, and KEGG pathways.
A heatmap showing the enrichment of Gene Ontology terms, PFAM domains, and KEGG pathways
among the proteins we identified as RNA-dependent binders and those that were not (labeled "Yes"
and "No," respectively). Enrichment of Gene Ontology terms, PFAM domains, and KEGG pathways is
depicted in green, red, and blue, respectively. Colors correspond to the negative log base 10 of the
hypergeometric P-values. The columns are enrichment seen among proteins interacting in an
RNA-dependent manner with Puf3, Pab1, or Nab2.



The transcriptional coactivator Mbf1 
cross-links to RNA in a region distinct 
from its DNA-binding domain. 

Since none of these validated RNA-bind- 
ing proteins have homology with protein 
domains known to bind RNA, we sought to identify the region 
of each protein that cross-links to RNA. We performed a partial 
protease digest of purified proteins to reveal structured regions 
cross-linked to radioactively labeled, exhaustively digested RNA 
fragments. We then analyzed the digestion products by SDS-PAGE 
to find distinct bands representing ordered domains of the pro- 
tein (Fontana et al. 2012). Next, we measured the radioactivity 
of each of these bands to determine if they were cross-linked to 
RNA. Protease digestion of Mbf1 produced fragments that could 
be resolved as three distinct bands by SDS-PAGE, corresponding 
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Table 1. Results of UV-cross-linking assay 




Molecular Function 



Description 



Vps1 
Aep1 
Rbg1 
Eaf3 
Syp1 
Aep2 
Iki1 
Yhr097c 



0.002% 
0.002% 
0.001% 
0.001% 
0.001% 
0.001% 
0.0006% 
No protein 
quant. 



unknown 
unknown 
unknown 
unknown 
Pumilio 
RRM 
unknown 
unknown 
unknown 
unknown 
RRM 
RRM 
RRM 
unknown 
unknown 
unknown 
unknown 
unknown 
unknown 
unknown 
Sm-like 
unknown 
RRM 
unkno 1 
unknown 
unknown 
unknown 
unknown 
unknown 
unknown 
unknown 
unknown 



unknown 
protein anchor 
unknown 

unknown 

mRNA binding 
RNA binding 
Rho GTPase activator 
poly(A) RNA binding 
DNA/RNA helicase 

unknown 

RNA binding 
mRNA/poly(U) binding 
poly(A) RNA binding 
unknown 

Putative RNA helicase 

unknown 

unknown 

IMP dehydrogenase 
DNA binding 
transcription coactivator 
RNA binding 

structural molecule activity 
mknown 
mRNA binding 
GTPase 
unknown 
GTP binding 

histone acetyltransf erase 

unknown 

unknown 

unknown 

unknown 



Putative helicase with limited sequence similarity to human Rb protein 
COPII vesicle coat protein required for ER transport vesicle budding 
Topoisomerase ll-associated deadenylation-dependent mRNA-decapping factor 
High-copy suppressor of group II intron-splicing defects 

mitochondrial, promotes degradation of nuclear-encoded mitochondrial mRNAs 
involved in the export of mRNAs from the nucleus to the cytoplasm 
involved in the control of cytoskeleton organization and cellular morphogenesis 
required for nuclear mRNA export and poly(A) tail length control 
DNA dependent ATPase/DNA helicase involved in modulating translation termination 
deletion causes mitochondrial defects but none in growth or translation initiation 
involved in the export of mRNAs from the nucleus to the cytoplasm 
abundant mRNP-component that is required for stability of many mRNAs 
part of 3'-end RNA-processing, mediates interactions between the 5' cap and the 3' tail 
possible role in delivering mitochondrial mRNAs to ribosomes 
related to Ski2p, involved in translation inhibition of non-poly(A) mRNAs 
Protein of unknown function thought to be involved in endocytosis 
Protein of unknown function, mutant phenotype suggests a role in vacuolar function 
catalyzes the first step of GMP biosynthesis 
protein required for optimal translation under nutrient stress 
bridges the DNA-binding region of Gcn4p and TATA-binding protein Spt15p 
may have a role in RNA processing 
Nuclear pore complex (NPC), involved in protein import/export and in export of RNAs 
Protein involved in G2/M phase progression and response to DNA damage 
part of the mRNA localization machinery for certain bud localized proteins 
required for vacuolar sorting 

Protein required for expression of the mitochondrial subunit 9 of F1-F0 ATP synthase 
member of the DRG family, associates with translating ribosomes 
nonessential component of the NuA4 acetyltransf erase complex, DNA repair 
involved in endocytic site formation, may regulate assembly/disassembly of septin ring 
likely involved in translation of the mitochondrial subunit F1-F0 ATP synthase mRNA 
Subunit of Elongator complex, required for modification of wobble nucleosides in tRNA 
Putative protein of unknown function 



03% | unknown | peptidyl-prolyl isomerase | peptidyl-prolyl cis-trans isomerase
2% | unknown | amino-acid transaminase | mitochondrial branched-chain amino acid (BCAA) aminotransferase
35 | Tom71 | 0.0001% | unknown | protein transporter | mitochondrial outer membrane protein with similarity to Tom70p
36 | Bmh1 | 0.00003% | unknown | DNA rep. origin binding | 14-3-3 protein, major isoform, binds proteins and DNA
37 | Gcd11 | No band | unknown | translation initiation factor | Gamma subunit of eIF2, involved in the identification of the start codon
38 | Cin8 | No band | unknown | (+/-) end microtubule motor | Kinesin motor protein involved in mitotic spindle assembly and chromosome segregation
39 | Ppt1 | No band | unknown | protein Ser/Thr phosphatase | nucleus and cytoplasm, potential role in phosphate metabolism and rRNA processing
40 | Tfa2 | No band | unknown | ssDNA binding | TFIIE small subunit, involved in RNA polymerase II transcription initiation
41 | Vam6 | No band | unknown | Rab guanyl exchange factor | Vacuolar protein that plays a critical role in the tethering for vacuolar membrane fusion
42 | Mip6 | No band | RRM | RNA binding | interacts with Mex67, a component of the nuclear pore involved in mRNA export
43 | Ccw12 | No band | unknown | unknown | Cell wall mannoprotein, role in maintenance of newly synthesized areas of cell wall
44 | Erg11 | No band | unknown | sterol 14-demethylase | lanosterol 14-alpha-demethylase in the ergosterol biosynthesis pathway
45 | Lcb2 | No band | unknown | palmitoyltransferase | responsible for the first committed step in sphingolipid synthesis



(Green) The known RBPs; (red) negative controls; (white) candidate RBPs. Cross-linking efficiency was calculated by comparing the
molecules of cross-linked RNA to the molecules of protein. The proteins that had detectable bands in our UV-cross-linking assay are
ranked by cross-linking efficiency. (Horizontal black line) The empirical cutoff we drew based on the highest cross-linking efficiency
observed among the negative control proteins. There were 23 candidate RBPs above this cutoff. Note that while Yhr097c has an unknown
protein abundance, it cross-linked to a similar number of molecules of RNA as the negative controls (which are above the 95th
percentile for protein abundance). Therefore, there is a >95% chance that Yhr097c cross-links with higher efficiency than the
negative controls. 



to putative stable digestion products (Fig. 5A). Bands 2 and 3 had 
strong signals from the cross-linked, radiolabeled RNA, while 
band 1 did not (Fig. 5A). We excised these three bands (and 
undigested Mbf1) from the gel and analyzed them by mass 
spectrometry, comparing the enrichment of each peptide relative to 
undigested Mbf1 for each band after normalization (Fig. 5B). This 
identified a region at the N terminus in the multiprotein bridging 
factor (MBF) domain that was ~10-fold enriched in bands 2 and 3 
but not band 1 (Fig. 5B). Conversely, band 1, which did not 
cross-link to RNA, displayed approximately twofold enrichment for 
peptides derived from the helix-turn-helix DNA-binding domain. 
These results imply that the RNA-binding domain of Mbf1 is 
distinct from its DNA-binding domain, suggesting that Mbf1 could 
potentially bind simultaneously to DNA and RNA. 

A large fraction of the RNA-dependent binders that we 
identified are annotated as DNA-binding proteins, including several 
transcription factors such as Mbf1 (Supplemental Fig. S6). We 
speculate that the RNA-dependent binders that also bind DNA 
may operate to connect the post-transcriptional regulatory 
network to the transcriptional regulatory network, by first binding 
DNA to regulate transcription and subsequently binding to the 
nascent RNA to affect its stability or translation in the cytoplasm. 
Indeed, recent reports provide evidence that transcriptional 
regulation can affect post-transcriptional regulation in yeast 
(Harel-Sharvit et al. 2010; Bregman et al. 2011; Choder 2011). In addition, 
a connection between the transcription and the processing 
of RNA has long been known to exist (Cramer et al. 1997; 
McCracken et al. 1997). 

Analysis of the proteins that bind to RNAs concurrently 
with Nab2 or Puf3 expands on the existing models of Nab2 
and Puf3 function in post-transcriptional regulation 

While RNA immunoprecipitation methods (RIP-chip, CLIP-seq, 
and related) can identify specific interactions between RNAs and 
RNA-binding proteins, they cannot identify whether the multiple 
proteins that interact with a given RNA bind concurrently, 
sequentially, or in mutually exclusive cellular locations. In contrast, 
our RNA-dependent interaction data enable us to directly identify 
pairs of proteins that bind concurrently to one or more RNAs in 
a cell. Together with other information about these RBPs, this can 
provide clues to the function of an RBP of interest and to when and 
where in the cell it binds. 

Figure 5. Mapping the RNA-binding domain of Mbf1. (A) SDS-PAGE 
analysis of protein fragments resulting from partial digestion of Mbf1 
with the protease chymotrypsin. The lanes from left to right contain the 
ladder, chymotrypsin only, the supernatant of protein fragments liberated 
by chymotrypsin digestion of Mbf1, and the protein fragments remaining 
on the beads. A total protein stain is shown on the left, and the radioactive 
image of the same gel is shown on the right. Gel images were scaled and 
aligned to facilitate direct comparison of the visible bands. The radioactive 
image shows a signal from 32P-labeled RNA fragments cross-linked to 
Mbf1. (B) A diagram showing the domains of Mbf1 and the position 
and enrichment relative to full-length Mbf1 for all the peptides that 
were detected in each sample. The fold enrichment of the normalized 
intensity of each peptide relative to undigested Mbf1 for each of the three 
bands is represented by a color gradient ranging from dark blue for less 
than one-tenth fold enriched, to white for no enrichment, to dark red for 
greater than 10-fold enriched. (Gray) Areas of the protein for which no 
peptides were detected. 

Nab2 is involved in the end processing, polyadenylation, 
and export of poly(A) mRNA from the nucleus (Green et al. 2002; 
Hector et al. 2002; Fasken et al. 2008; Iglesias and Stutz 2008; 
Tutucci and Stutz 2011). It is generally believed that Nab2 binds 
to mRNAs during their end cleavage and polyadenylation and 
is removed immediately following their nuclear export (Lee and 
Aitchison 1999; Tran et al. 2007). Previous work also revealed that 
the mRNAs bound by Nab2 tend to encode nuclear-localized 
proteins involved in transcription and splicing (Guisbert et al. 2005; 
Hogan et al. 2008). We confirmed several known RNA-dependent 
interactions with Nab2 in our data and uncovered novel 
RNA-dependent interactions that were consistent with the known role 
of Nab2 in end processing and mRNA export, such as THO/TREX 
complex components, Mex67, Mtr2, and Nup1 (Fig. 6). 

Using our data to extend the existing model of Nab2 
function, we looked for novel RNA-dependent interactions between 
Nab2 and other well-studied RNA-binding proteins involved in 
processes other than mRNA polyadenylation and export (Fig. 6). 
We detected novel RNA-dependent interactions between Nab2 
and several protein components of the splicing apparatus (Smb1, 
Smd1, Smd2, Smd3, Smx3, Cef1, Luc7, Msl5, Prp19, Prp22, Prp39, 
and Yhc1) (Fig. 6). We also found RNA-dependent interactions 
with proteins involved in transcription or the regulation of 
transcription (Tfa2, Arp9, Gat1, Mbf1, Met28, and the RNA polymerase II 
central core component Rpb2) (Fig. 6). These interactions appear 
to be specific to Nab2, because most are not seen with Pab1 (5/5 Sm 
proteins, 1/7 other splicing, 1/6 transcription related) or Puf3 (0 out 
of 18). Nab2's unexpected RNA-dependent interactions with these 
proteins involved in splicing and transcription suggest that in 
some cases Nab2 may bind earlier than generally believed, 
perhaps cotranscriptionally. We also find a novel RNA-dependent 
interaction between Nab2 and the nuclear exosome core 
component Rrp6, suggesting that Nab2 remains associated with some 
mRNAs when they are targeted for surveillance or degradation. 
Finally, while in vitro experiments and genetic interactions have 
led to the model that Nab2 is removed from its mRNA targets by 
helicases anchored on the cytoplasmic face of the nuclear pore 
complex (Tran et al. 2007), we discovered novel RNA-dependent 
interactions between Nab2 and proteins involved in translation 
and the repression of translation, such as Tif4631, Tif4632, Cdc33, 
Sbp1, Khd1, and Pab1. This suggests that in some cases Nab2 
remains bound to its targets after mRNA export (Fig. 6). These 
results illustrate how analyzing the well-studied RBPs that bind 
concurrently with Nab2 can expand the model of Nab2 function and 
refine our view of when in the life of its mRNA targets it binds. 

Applying a similar approach to Puf3 identifies several novel 
RNA-dependent interactions that extend and refine the known 
role of Puf3 in repressing the expression of its mRNA targets. Puf3 
promotes the decay and localization of its mRNA targets (Gerber 
et al. 2004; Saint-Georges et al. 2008; Lee et al. 2010; Quenault et al. 
2011). It also physically interacts with decay proteins such as the 
major cytoplasmic deadenylase complex Ccr4-Not (in an 
RNA-independent manner) (Lee et al. 2010). Our method has revealed 
that in addition to its role in promoting decay and localization, 
Puf3 binds to mRNAs concurrently with proteins involved in 
translation and translational repression, namely, Tif4631, Tif4632, 
Cdc33, Pat1, and Stm1 (Fig. 7). We have also discovered novel 
RNA-dependent interactions between Puf3 and the P-body and 
RNA decay-related proteins, Xrn1 and the Lsm ring complex 
(Fig. 7). Finally, we learned that Puf3 can bind to mRNAs 
concurrently with the stress granule proteins Sgn1 and Pub1 (Fig. 7). 
Puf3 can promote the deadenylation and decay of its mRNA targets 
independent of Ccr4 (Lee et al. 2010). It has been hypothesized to 
recruit an as-yet-unknown factor or factors to promote the 
rearrangement of the mRNP structure from a pro-translation/stability 
state into an anti-translation/decay state (Lee et al. 2010). Given 
that Stm1 and the Pat1/Lsm-ring complex are involved in the 
repression of translation and promote mRNA decapping/decay 
(Marnef and Standart 2010; Balagopal and Parker 2011), we 
speculate that these proteins may be the undiscovered factors that 
Puf3 recruits to its target mRNAs to promote their degradation. 
Going beyond the known role of Puf3, we found novel 
RNA-dependent interactions between Puf3 and proteins involved in 
repressing translation, suggesting that Puf3 may also repress the 
expression of its mRNA targets at the translational level. These 
vignettes illustrate how our data provide a unique perspective into 
the makeup of the RNA-protein complexes in which an RBP of 
interest is found and highlight the value of this technique for the 
study of post-transcriptional regulation. 

Figure 6. A revised model for Nab2 activity. This diagram depicts a subset of the proteins that we 
found to interact with Nab2 in an RNA-dependent or RNA-independent manner. We used our observations 
to expand on previous models of Nab2 function. Proteins separated by a black line (the RNA) from Nab2 
had RNA-dependent interactions. When limited to well-studied proteins known to bind RNA, these 
RNA-dependent interaction data suggest that these RBPs bind to the same RNAs at the same time as 
Nab2. Proteins were placed in this diagram according to their known roles in RNA processing and 
regulation. Note that the RNA-independent interactions we detected between Nab2 and Mlp1 and 
Kap104 are also shown because of their known roles in Nab2 function. 



Discussion 

A growing body of evidence suggests that post-transcriptional 
regulation mediated by RBPs is a widespread phenomenon, but 
how this happens largely remains to be discovered. It is clear from 
the many published examples of regulatory RNA-binding activity 
in unexpected proteins that we need methods to enable the 
unbiased discovery of novel RNA-protein interactions. A prevailing 
model of post-transcriptional regulation is that the specific 
complement of RBPs bound to a given mRNA specifies its 
post-transcriptional fate, yet nearly all existing data are limited to 
defining pairwise interactions between a single RBP and a single 
mRNA species, potentially missing vital information about this 
aspect of post-transcriptional regulation. 

Here we developed a method that characterizes RNA-protein 
interactions from a different perspective. It combines quantitative 
mass spectrometry with RNase treatment of affinity-purified 
RNA-protein complexes assembled in vivo. We interrogated the 
constituents of RNA-protein complexes containing the known 
RNA-binding proteins Pab1, Nab2, or Puf3, providing a new 
perspective on the role of Nab2 and Puf3 in post-transcriptional 
regulation. 

Our data revealed a large and diverse group of previously 
unrecognized RNA-binding proteins and showed that the majority 
of previously reported protein-protein interactions involving 
Pab1 or Nab2 that we could detect are, in fact, RNA dependent. 
We extrapolate that other reported protein-protein interactions, 
especially those involving abundant RNA-binding proteins, may 
likewise reflect concurrent binding to RNA rather than direct 
interactions. We identified several annotated DNA-binding proteins 
as RNA-dependent binders. These proteins may both bind DNA to 
regulate transcription and subsequently bind to the nascent RNA 
and regulate its stability or translation in the cytoplasm, as 
a means of coordinating the transcriptional and post-transcriptional 
regulation of a given gene, a model that has been suggested by 
previous work (Cramer et al. 1997; McCracken et al. 1997; 
Harel-Sharvit et al. 2010; Bregman et al. 2011). In contrast to 
previous applications of mass spectrometry to the identification 
of RNA-protein interactions, our approach appears to be largely 
unbiased by protein abundance. Strikingly, ~50% (114/220) of the 
RNA-dependent binders we identified were already known to be RBPs 
(enrichment P-value 2 × 10^-35), which is a considerable 
improvement over previous approaches. 

The RBP Nab2 is involved in the end processing, polyadenylation, 
and export of poly(A) mRNA from the nucleus; we see both known 
and novel RNA-dependent interactions with Nab2 that are consistent 
with the existing model (Hector et al. 2002; Tran et al. 2007; Fasken 
et al. 2008; Iglesias and Stutz 2008; Tutucci and Stutz 2011). However, 
our data provide new insight into the temporal program of Nab2 binding 
based on evidence for concurrent binding with RBPs involved in 
transcription, splicing, and translation. From its RNA-dependent 
interaction partners, we infer a model in which Nab2 binds 
cotranscriptionally and remains bound during splicing and end 
processing. Nab2 also appears to remain bound to mRNAs that fail 
splicing or are otherwise targeted for surveillance/degradation by 
the nuclear exosome. For mRNAs that pass nuclear quality control, 
Nab2 interacts with the nuclear pore to promote their export. After 
export into the cytoplasm, Nab2 remains bound to its mRNA 
targets as they are bound by cytoplasmic translation regulatory 
proteins and perhaps until they initiate the first round of translation. 
An intriguing possibility is that some of these proteins typically 
involved in regulating translation in the cytoplasm may be loaded 
onto mRNAs before or during export, while Nab2 is still bound. 

Figure 7. Insights from RNA-dependent interactions with Puf3. This 
diagram depicts a subset of the proteins that we found to interact with 
Puf3 in an RNA-dependent manner. Proteins separated by a black line (the 
RNA) from Puf3 had RNA-dependent interactions. When limited to 
well-studied proteins known to bind RNA, these RNA-dependent interaction 
data suggest that these RBPs bind to the same RNAs at the same time as 
Puf3. Proteins were placed in this diagram according to their known roles 
in RNA regulation. 

The RBP Puf3 is known to promote the decay of its mRNA 
targets (Gerber et al. 2004; Saint-Georges et al. 2008; Lee et al. 2010; 
Quenault et al. 2011). We identified novel RNA-dependent 
interactions consistent with this role. Proteins that bind RNAs 
concurrently with Puf3 include candidates (Stm1 and the Pat1/Lsm1 
ring complex) for hypothetical factors recruited by Puf3 to promote 
the rearrangement of its mRNP structure from a pro-translation/ 
stability state into an anti-translation/decay state (Lee et al. 2010). 
We found that Puf3 binds to mRNAs at the same time as proteins 
that are involved in translational repression (Pat1, Stm1), found 
in P-bodies (Xrn1, Lsm ring), or found in stress granules (Sgn1, 
Pub1), suggesting that Puf3 may repress its mRNA targets at the 
translational level as well. As a part of this model, these proteins 
may briefly physically interact with Puf3 as they are recruited to 
a Puf3-bound mRNA, but we would not necessarily expect to detect 
this interaction in our assay if at steady state a substantial fraction 
of these proteins remain bound to RNA but not Puf3. 

We identified as candidate RBPs 106 proteins that were not 
previously known to bind RNA. Of the 30 candidates we tested, 
23 (77%) bound directly to RNA in an independent assay. None of 
these 23 novel RNA-binding proteins have known RNA-binding 
domains, and many have unknown molecular functions. The 
known functions of the novel RBPs were diverse, including a 
vesicle trafficking protein (Sec16), a transcriptional coactivator (Mbf1), 
a regulator of translational elongation (Stm1), two helicases (Ecm32 
and Slh1), a metabolic enzyme (Imd4), a dynamin-like GTPase 
(Vps1), and a histone acetyltransferase (Eaf3). We speculate that 
in some cases these unexpected RNA-protein interactions 
involving proteins with already established biological functions that are 
apparently unrelated to RNA binding might have evolved to 
facilitate mRNA localization. Specifically, RNAs may have evolved 
structured elements to bind to specific proteins with distinct 
localization patterns (such as Imd4) to "hitch a ride" to, or hold their 
position in, a particular part of the cell. Overall, our high success 
rate for validating the RNA-binding activity of the proteins 
identified as RDBs suggests that many of the 76 candidates we have yet 
to test may also bind directly to RNA (Supplemental Table S5). 

Existing methods that use microarrays or sequencing to 
identify the RNA targets of specific RBPs can identify sets of RBPs that 
interact with common RNA targets. The method we describe here 
makes it possible to determine which of those interactions are 
concurrent. This is a crucial distinction, because while each RNA 
may be bound by several different RBPs over the course of its 
lifetime (Hogan et al. 2008), many of those RBPs may bind at 
different times or places within the cell. When applied to a 
specific RNA-binding protein, the identity of other concurrently 
associated RBPs can provide clues to its position in the temporal 
sequence of protein-RNA interactions and the subcellular location in 
which they occur. This approach could thus be broadly applicable to 
mapping relationships and connections in the RNA-protein 
network that affects the fate of each mRNA. A similar approach could 
also be used to identify proteins that bind to DNA concurrently. 



Methods 

RNA-dependent protein purification 

We grew TAP-tagged yeast strains (Pab1-TAP, Nab2-TAP, and 
Puf3-TAP) auxotrophic for lysine to mid-log phase in media with or 
without heavy labeled L-lysine. We lysed the cells and purified 
the RNA-binding proteins essentially as described previously 
(Tsvetanova et al. 2010), except that we split the beads equally 
after the initial washes and performed the subsequent three washes 
in buffer with or without RNase (in excess). Note: For the Pab1 Mg2+ 
purification, the wash buffers contained 1.8 mM MgCl2, but for all 
other purifications, the washes were done with buffer containing 
10 mM EDTA. Finally, we boiled the beads in Laemmli sample 
buffer and proceeded to analysis by mass spectrometry. A detailed 
protocol is available in the Supplemental Material. 

Quantitative mass spectrometry 

Proteins were separated by SDS-PAGE, and each lane was sliced 
into eight fractions, which were further minced. The minced gel 
pieces were then destained, alkylated, and incubated 
overnight with LysC. The resulting peptides were then extracted 
from the gel, separated by capillary chromatography, and 
analyzed on an LTQ-Orbitrap XL. The MS data were processed using 
the MaxQuant software suite (version 1.2.0.18) (Cox and Mann 
2008) and a yeast protein database (6717 entries and its reverse 
complement). For the search, oxidation on methionine and protein 
N-terminal acetylation were set as variable modifications. 
Protease cleavage specificity was set to LysC. False discovery rates at 
the peptide and protein levels were set to 0.01, and only proteins 
with at least two quantitation events were considered for the 
subsequent bioinformatic analysis. A detailed protocol is 
available in the Supplemental Material. 

Analysis of mass spectrometry data 

The forward experiment (heavy labeled without RNase over 
unlabeled with RNase) and the reverse experiment (heavy labeled 
with RNase over unlabeled without RNase) were analyzed by mass 
spectrometry separately and treated as replicates (except with 
inverted heavy-to-light ratios). To generate a high-confidence data 
set, we filtered the mass spectrometry results for proteins for 
which we detected two peptides in both the forward and the 
reverse experiment (that map to only one protein). To generate a 
background set of all proteins that had the opportunity to be 
detected in our assays, we used mass spectrometry data from an 
analysis of all proteins detected from a yeast whole-cell lysate and 
filtered for two peptides in at least two replicates. This 
background set (Supplemental Table S7) was used to calculate 
enrichment of known RNA-binding proteins as well as Gene Ontology 
terms, KEGG pathways, and PFAM domains. We then inverted 
the heavy-to-light ratios for the reverse experiment, normalized 
the forward and reverse samples so that the ratio for the 
TAP-tagged protein was 1, and then averaged the forward and reverse 
values. We then took the log base 2 of these heavy-to-light ratios 
and worked with the data in this format from this point on 
[referred to as log2 (-/+) RNase ratios or RNA-dependence values]. 
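
The ratio processing described above can be illustrated with a minimal sketch. It assumes each experiment is available as a mapping from protein name to heavy-to-light ratio; the function name, argument names, and data layout are hypothetical and are not taken from the original analysis scripts.

```python
import math

def rna_dependence(forward, reverse, bait):
    """Return log2 (-/+) RNase ratios for proteins seen in both experiments.

    forward: protein -> H/L ratio, heavy = without RNase, light = with RNase
    reverse: protein -> H/L ratio, heavy = with RNase, light = without RNase
    bait:    name of the TAP-tagged protein used for normalization
    (hypothetical data layout, for illustration only)
    """
    # Invert the label-swap experiment so both ratios read (-RNase)/(+RNase).
    reverse_inv = {p: 1.0 / r for p, r in reverse.items()}
    # Normalize each experiment so the TAP-tagged protein has a ratio of 1.
    fwd_norm = {p: r / forward[bait] for p, r in forward.items()}
    rev_norm = {p: r / reverse_inv[bait] for p, r in reverse_inv.items()}
    # Average forward and reverse values, then take log base 2.
    shared = fwd_norm.keys() & rev_norm.keys()
    return {p: math.log2((fwd_norm[p] + rev_norm[p]) / 2.0) for p in shared}
```

Under this convention the TAP-tagged protein has an RNA-dependence value of 0, and a protein lost fourfold upon RNase treatment has a value of 2.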

To establish a conservative cutoff for the classification of 
proteins into RNA-dependent binders and protein binders based 
on their RNA-dependence values, we modeled the distributions 
of RNA-dependence values for proteins with RNA-independent 
interactions with Pab1, Nab2, and Puf3. To do this, we made two 
assumptions: first, that after normalization, any RNA-dependence 
values less than zero have a true value of zero and the observed 
variation from zero is due to noise; and, second, that this noise 
is symmetric about zero. Using these two assumptions, we took 
the RNA-dependence values less than zero (excluding the most 
negative 1% as extreme outliers) and combined them with their 
absolute values to form a null distribution symmetric about zero 
(Supplemental Fig. S3). We used this null distribution to 
determine an empirical FDR cutoff of 10% for the classification of 
RDBs (Supplemental Fig. S3). To evaluate this cutoff 
independently, we plotted the frequency of annotated RBPs in a sliding 
window versus the RNA-dependence values (Supplemental Fig. S4). 
This analysis revealed that the frequency of annotated RBPs was 
well above the median frequency for all proteins that could be 
detected from a yeast whole-cell lysate as well as the median 
frequency for all proteins detected in each purification experiment 
(Supplemental Fig. S4). Note that ribosomal proteins were 
excluded from the analysis for Supplemental Figures S1 and S4 
because they are common mass spectrometry contaminants, and 
they also often have an annotated molecular function of RNA 
binding. The sliding-window analysis serves as independent 
validation of the cutoff we made for classifying proteins as RDBs. 
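
A minimal sketch of the mirrored null distribution and an empirical FDR threshold follows. The text does not spell out how the FDR was computed from the null, so the estimator here (null values at or above a threshold divided by observed values at or above it, with the number of null proteins taken to be the size of the mirrored set) is an assumption, and all function names are hypothetical.

```python
def mirrored_null(values, trim_fraction=0.01):
    """Build a null distribution symmetric about zero from negative values."""
    negatives = sorted(v for v in values if v < 0)
    # Drop the most negative fraction as extreme outliers.
    n_trim = int(len(negatives) * trim_fraction)
    kept = negatives[n_trim:]
    # Combine the kept negative values with their absolute values.
    return kept + [-v for v in kept]

def fdr_cutoff(values, target_fdr=0.10, trim_fraction=0.01):
    """Smallest positive threshold whose estimated FDR is <= target_fdr.

    Assumed estimator: (# null values >= t) / (# observed values >= t).
    Returns None if no threshold meets the target.
    """
    null = mirrored_null(values, trim_fraction)
    for t in sorted(v for v in values if v > 0):
        observed = sum(1 for v in values if v >= t)
        expected_false = sum(1 for v in null if v >= t)
        if expected_false / observed <= target_fdr:
            return t
    return None
```

Proteins with RNA-dependence values at or above the returned threshold would be classified as RDBs under these assumptions.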

We calculated the enrichment of Gene Ontology terms, PFAM 
domains, and KEGG pathways using the GOstats package in R 
(Falcon and Gentleman 2007). We corrected the resulting P-values 
for multiple hypothesis testing using the Bonferroni correction. 
We made the RNA-protein interaction network diagram with the 
program Cytoscape (Smoot et al. 2011). We made the diagrams 
depicting the RNA-dependent binding interactions as well as the 
method overview with the program OmniGraffle by the Omni 
Group. The protein abundance data we used were previously 
published (Ghaemmaghami et al. 2003). The set of high-confidence 
protein-protein interactions we used was from Collins et al. 
(2007). 
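
The enrichment analysis itself was run in R. The Python sketch below is not that code; it only illustrates the one-sided hypergeometric test that underlies this kind of enrichment calculation, together with the Bonferroni correction. The function names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """One-sided P-value P(X >= k) for over-representation.

    N: proteins in the background set
    K: background proteins carrying the annotation
    n: proteins in the test set (e.g., RNA-dependent binders)
    k: test-set proteins carrying the annotation
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities for k, k+1, ..., min(n, K).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

The background set matters here: N and K are counted over the proteins that had the opportunity to be detected, not over the whole proteome.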

UV cross-linking assay 

To test whether our candidate RNA-binding proteins cross-link 
directly to RNA by UV irradiation, we used a method based on work 
by Greenberg (1979, 1980) and Ule et al. (2005). First, we 
cross-linked RNA to protein in vivo by UV irradiation and purified the 
TAP-tagged candidate RBPs under denaturing conditions. Then, 
we subjected each sample to limited digestion by MNase and then 
subjected half to further, exhaustive digestion by RNase. We next 
labeled the RNA fragments by polynucleotide kinase treatment 
and ran the samples on a denaturing SDS-PAGE gel. We looked 
specifically for the presence of a PNK-labeled band that was RNase 
sensitive and of corresponding size to the protein of interest. We 
also included 10 known RBPs as positive controls and five 
putative negative-control proteins that were selected from among 
highly abundant proteins (95th percentile for abundance) for 
which we had no evidence to suggest that they bind RNA. 
Molecules of cross-linked RNA were quantified by comparing the 
intensity of the radioactive band with a standard curve, and molecules 
of protein were estimated based on published protein abundance 
data and the number of cells used, the typical lysis efficiency, and 
the typical purification efficiency. A detailed protocol is available 
in the Supplemental Material. 
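
The cross-linking efficiency calculation and the negative-control cutoff can be sketched as follows. This is an illustration under stated assumptions: the protein estimate is taken to be a simple product of the four quantities named above, and all function names, protein names, and numbers are hypothetical.

```python
def protein_molecules(copies_per_cell, n_cells, lysis_eff, purification_eff):
    """Estimate purified protein molecules (assumed to be a simple product)."""
    return copies_per_cell * n_cells * lysis_eff * purification_eff

def crosslink_efficiency(rna_molecules, protein_mols):
    """Molecules of cross-linked RNA per molecule of protein."""
    return rna_molecules / protein_mols

def candidates_above_controls(efficiencies, negative_controls):
    """Rank non-control proteins whose efficiency exceeds every negative control."""
    # Empirical cutoff: the highest efficiency among the negative controls.
    cutoff = max(efficiencies[name] for name in negative_controls)
    passing = [
        name for name, eff in efficiencies.items()
        if eff > cutoff and name not in negative_controls
    ]
    return sorted(passing, key=lambda name: efficiencies[name], reverse=True)
```

For example, `crosslink_efficiency(1.25e8, protein_molecules(1000, 1e9, 0.5, 0.25))` divides the RNA estimate by 1.25e11 protein molecules.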

Identification of RNA-binding protein domains 

We prepared the protein samples exactly as they were for the 
UV-cross-linking assay described above, except that we scaled 
up everything fourfold. After we subjected the samples to exhaustive 
RNase digestion and radioactive labeling, we digested them with 
chymotrypsin, trypsin, or elastase ranging in concentration from 
0.1 mg/mL to 0.0001 mg/mL (10-fold serial dilutions). We analyzed the 
supernatants containing protein fragments liberated by protease 
digestion by SDS-PAGE, visualizing both total protein and 
radioactive signal in the same gel. We analyzed the distinct bands 
by mass spectrometry, to map them to a specific position in the 
full-length protein. This information, combined with our 
observation of which bands were radioactively labeled, allowed us to 
identify the regions of the protein that were cross-linked to RNA. A 
detailed protocol is available in the Supplemental Material. 
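
A minimal sketch of the per-peptide fold enrichment used to map a band onto the full-length protein is given below. The normalization scheme is not specified in the text, so scaling each sample's peptide intensities to sum to 1 is an assumption, and the function names and peptide labels are hypothetical.

```python
def normalize(intensities):
    """Scale peptide intensities so they sum to 1 (assumed normalization)."""
    total = sum(intensities.values())
    return {pep: val / total for pep, val in intensities.items()}

def fold_enrichment(band, undigested):
    """Fold enrichment of each peptide in a band relative to undigested protein.

    band, undigested: peptide -> raw intensity. Peptides absent from the
    undigested sample (or with zero intensity there) are skipped.
    """
    band_n = normalize(band)
    ref_n = normalize(undigested)
    return {
        pep: band_n[pep] / ref_n[pep]
        for pep in band_n
        if ref_n.get(pep, 0) > 0
    }
```

Peptides with values well above 1 in a radiolabeled band and near or below 1 in an unlabeled band would point to the region that cross-links to RNA.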

Data access 

Raw data are included as Supplemental Material with this manuscript. 